Abstract

We study probabilistic single-item second-price 
auctions where the item is characterized by a set 
of attributes. The auctioneer knows the actual in- 
stantiation of all the attributes, but he may choose 
to reveal only a subset of these attributes to the 
bidders. Our model is an abstraction of the fol- 
lowing Ad auction scenario. The website (auc- 
tioneer) knows the demographic information of its 
impressions, and this information is in terms of a 
list of attributes (e.g., age, gender, country of loca- 
tion). The website may hide certain attributes from 
its advertisers (bidders) in order to create a thicker
market, which may lead to higher revenue. We
study how to hide attributes in an optimal way. 
We show that it is NP-hard to solve for the op- 
timal attribute hiding scheme. We then derive a 
polynomial-time solvable upper bound on the op- 
timal revenue. Finally, we propose two heuristic- 
based attribute hiding schemes. Experiments show 
that revenue achieved by these schemes is close to 
the upper bound. 



1 Introduction 

One advantage of Internet advertising is that it offers adver- 
tisers the ability to target customers based on various traits 
such as demographics. [Even-Dar et al., 2007] showed that,
for sponsored search of a given keyword, instead of running a 
single auction for the keyword, we can split the whole auction 
into many separate auctions based on visitors/impressions' 
contexts (e.g., demographics). For example, if we know and 
only know the visitors' locations, then each location defines a 
context. In this example scenario, splitting based on context 
means a separate auction for each location. Splitting based on
context increases the advertisers' welfare. The explanation is 
simple: after splitting, advertisers can tailor their bids to the 
context. As a result, advertisers generally only win (impres- 
sions from) visitors that they aim to target, and the payments 
are also lower, since advertisers only face competition from 
those targeting similar visitors. On the other hand, splitting 
may reduce the revenue received by the auctioneer (publisher, 
e.g., website) due to the thin market problem: there may be
few competitors for some contexts. Actually, if for every context,
there is only one advertiser interested in it, then the total
revenue is 0 under the standard second-price auction.

[Ghosh et al., 2007] observed that having a single auction
for all contexts and having a separate auction for each context
are not the only two options. There are other ways to
split based on context, and they may lead to much higher revenue.
The idea explored in [Ghosh et al., 2007] is to cluster
the contexts into bundles, and run separate auction for each 
bundle. For example, suppose there are three different con- 
texts: Beijing, Chicago, and London (assuming the only con- 
textual information is the location and visitors are only from 
these three cities). We can have one auction for the bundle 
Beijing and Chicago (and a second auction for London only). 
The interpretation (due to [Emek et al., 2012]) is that if a visitor
is from Beijing or Chicago, then the auctioneer informs
the advertisers that the impression is from one of these two 
cities, but not exactly which. When this happens, both ad- 
vertisers targeting Beijing and advertisers targeting Chicago 
will compete in the auction. Their bids depend on how much 
they value impressions from Beijing and Chicago, respec- 
tively. Their bids also depend on the conditional probability 
that the impression is from Beijing (or Chicago) given that 
the impression is from one of these two cities. 

To put it more formally, [Ghosh et al., 2007] studied probabilistic
single-item second-price auctions (again, interpretation
due to [Emek et al., 2012]). In such an auction, there
is only one item for sale under a second-price auction, but 
the item has different possible instantiations. The auc- 
tioneer knows the actual instantiation but the bidders do 
not. The auctioneer may choose to hide certain information 
from the bidders if this increases the revenue. The prob- 
abilistic single-item second-price auction model is an ab- 
straction of the following Ad auction scenario. We have a 
website that sells one advertisement slot. That is, there is 
only one item - the only advertisement slot, but the item 
takes many possible instantiations, due to the fact that vis- 
itors/impressions have different demographic profiles. The 
auctioneer knows every visitor's demographic profile, and 
he may hide certain information from the advertisers. As 
mentioned above, [Ghosh et al., 2007] considered hiding information
by clustering: the auctioneer tells the bidders
that the actual instantiation is among several instantiations.
[Emek et al., 2012; Bro Miltersen and Sheffet, 2012] studied
the exact same model and went one step further. These two
papers studied hiding information by signaling: the auction- 
eer sends out different signals, and the bidders infer the prob- 
ability distribution of the actual instantiation, based on the 
signal received. It is easy to see that signaling is more general
than clustering. Interestingly, for full information settings
(settings where the auctioneer knows the bidders' exact
valuations), [Ghosh et al., 2007] showed that it is NP-hard
to solve for the optimal clustering scheme (optimal in
terms of revenue). On the other hand, [Emek et al., 2012] and
[Bro Miltersen and Sheffet, 2012] independently showed
that, under the same full information assumption, it takes only 
polynomial time to solve for the optimal signaling scheme. 
This is mostly due to the fact that instantiations are treated as 
divisible goods under signaling schemes. 

In this paper, we continue the study of revenue-maximizing 
probabilistic single-item second-price auctions. We observe 
that in practice, ad impressions are categorized based on multiple
attributes. Given this, we argue that the most natural
way to hide information is by hiding attributes. For example, 
let there be three attributes, each with two possible values: 

• Age: Teenager, Adult 

• Gender: Male, Female 

• Location: US, Non-US 

Together there are 2^3 = 8 possible instantiations. Under the
clustering scheme studied in [Ghosh et al., 2007], the website
is allowed to hide information by bundling any subset
of instantiations. However, not all bundles are natural. For 
example, consider the bundle {(Teenager, Male, US), (Adult, 
Female, Non-US)}. By creating this bundle, the website basi- 
cally may tell the advertisers that a visitor is either a teenage 
US male or an adult Non-US female. This does not appear 
natural. The signaling scheme studied in [Emek et al., 2012;
Bro Miltersen and Sheffet, 2012] is even more general than
clustering, so it may also lead to unnatural bundles. 

On the other hand, attribute hiding always leads to natural 
bundles. For example, the website may hide the location at- 
tribute. That is, if the actual instantiation is (Teenager, Male, 
US), then the website may inform the advertisers that the vis- 
itor is a teenage male. By hiding the location attribute, we es- 
sentially created a bundle (Teenager, Male, ?), which consists 
of both (Teenager, Male, US) and (Teenager, Male, Non-US). 

Based on the above example, it is easy to see that attribute 
hiding is clustering with a particular structure. It should be 
noted that this relationship between attribute hiding and clus- 
tering does not mean previous results on clustering apply to 
our model. For example, one of the two main results from 
[Ghosh et al., 2007] is a constructed clustering scheme that
guarantees one half of the optimal revenue (and one half of 
the optimal social welfare). The construction does not apply 
to our model since it generally leads to unnatural bundles. 

In this paper, we first show that it is NP-hard to solve 
for the optimal attribute hiding scheme.^1 We then derive a
polynomial-time solvable upper bound on the optimal revenue.
Finally, we propose two heuristic-based attribute hiding
schemes. Experiments show that revenue achieved by these
schemes is close to the upper bound.

Besides the aforementioned related work in the computer 
science literature, bundling has also been well-studied in the 
economics literature. [Palfrey, 1983] observed that for small
numbers of bidders, a revenue-maximizing auctioneer may
choose to bundle the items, and this makes bidders universally
worse-off. On the other hand, for large numbers of
bidders, the auctioneer may choose to unbundle the items,
and this hurts the high-demand bidders while benefiting the
low-demand bidders. [Chakraborty, 1999] quantitatively analyzed
the bundling behavior of the auctioneer. The result is
that under a Vickrey auction, for each pair of objects, there
is a unique critical number. If there are fewer bidders than
this number, the seller chooses to bundle the items, and vice
versa. [Avery and Hendershott, 2000] studied more sophisticated
bundling policies, including bundling with discounts and
probabilistic bundling (the probability of bundling occurring 
depends on the bids). 

2 Model Description 

There is a single item for sale characterized by k attributes
(attribute 1 to k). Attribute i has $C_i$ possible values, ranging
from 0 to $C_i - 1$. m is the total number of possible instantiations,
$m = \prod_i C_i$. In this paper, when we mention
polynomial time or NP-hardness, we mean in terms of m.
An instantiation whose i-th attribute equals $a_i$ is written as

$$(a_1, a_2, a_3, \ldots, a_k)$$

The space of all possible instantiations $\Omega$ is

$$\{0, \ldots, C_1 - 1\} \times \{0, \ldots, C_2 - 1\} \times \cdots \times \{0, \ldots, C_k - 1\}$$

Definition 1. A natural bundle b is an element from the following
set of all natural bundles (denoted by B):

$$\{0, \ldots, C_1 - 1, ?\} \times \{0, \ldots, C_2 - 1, ?\} \times \cdots \times \{0, \ldots, C_k - 1, ?\}$$

Natural bundles are bundles of instantiations resulting from 
hiding attributes. An attribute of a natural bundle either 
takes a specific value, or is represented by a question mark, 
which means that this attribute is hidden. For example, let
k = 5. Given the instantiation $(a_1, a_2, a_3, a_4, a_5)$, if we
hide attributes 1 and 3, then it results in the natural bundle
$(?, a_2, ?, a_4, a_5)$. This bundle has size $C_1 C_3$. As another example,
every instantiation itself corresponds to a natural bundle
of size 1 (no attribute hidden). An instantiation $\omega$ belongs
to a natural bundle b if and only if for every attribute, either
$\omega$ and b share the same attribute value, or the attribute is hidden
for b. Unlike the total number of arbitrary bundles, which
equals $2^m$, the total number of natural bundles is polynomial
in m, as shown below (every attribute has at least two values,
so $C_i + 1 \le C_i^{\log_2 3}$):

$$|B| = \prod_i (C_i + 1) \le \prod_i C_i^{\log_2 3} = m^{\log_2 3}$$
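As a quick sanity check on the count of natural bundles, the following Python sketch enumerates them by brute force for the three binary attributes used in the introduction. The function names and the tuple representation (with the string "?" for a hidden attribute) are illustrative assumptions, not notation from the model.

```python
from itertools import product
from math import prod

HIDDEN = "?"  # assumed marker for a hidden attribute (the question mark)

def natural_bundles(sizes):
    """List every natural bundle when attribute i has sizes[i] possible values."""
    choices = [list(range(c)) + [HIDDEN] for c in sizes]
    return list(product(*choices))

def bundle_size(bundle, sizes):
    """Number of instantiations covered: product of C_i over hidden attributes."""
    return prod(c for value, c in zip(bundle, sizes) if value == HIDDEN)

sizes = [2, 2, 2]                  # Age, Gender, Location
bundles = natural_bundles(sizes)
m = prod(sizes)                    # number of instantiations
print(len(bundles), m)             # 27 natural bundles over 8 instantiations
```

With binary attributes each position has three options (0, 1, or hidden), which is why 27 natural bundles appear here, compared with 2^8 = 256 arbitrary subsets of instantiations.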

^1 We mentioned earlier that [Ghosh et al., 2007] proved a similar
result. The authors showed that it is NP-hard to solve for the optimal
clustering scheme. It should be noted that our NP-hardness result is
not implied by this earlier result, which relied on a reduction involving
unnatural bundles. Actually, our requirement on bundles being
natural greatly adds to the difficulty of the reduction, and our proof
is based on completely new techniques.



The probabilities of different instantiations are based on
a publicly known distribution $\Delta(\Omega)$. To simplify the presentation,
when discussing bidders' valuations, we factor in the
probabilities. For example, if bidder i values $\omega$ at 5 when $\omega$ is
the actual instantiation, and $\omega$ happens with probability 0.1,
then we say bidder i's valuation for $\omega$ is 0.5.

Let n be the number of bidders. Let $v_i(\omega)$ be
bidder i's (expected) valuation for instantiation $\omega$.
Following [Ghosh et al., 2007; Emek et al., 2012;
Bro Miltersen and Sheffet, 2012],^2 we assume full information:
the auctioneer knows the bidders' true valuations.
Again, following previous models, we only consider bidders
with additive valuations. That is, bidder i's valuation for
bundle b, denoted by $v_i(b)$, equals $\sum_{\omega \in b} v_i(\omega)$.

Following previous models, the auction is the Vickrey auction.
We use 2(b) to denote the revenue for selling b as a bundle.
2(b) is the second highest value in $\{v_i(b) \mid 1 \le i \le n\}$.
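The additive valuation and the second-price revenue 2(b) can be sketched in a few lines of Python. This is a minimal illustration under assumed conventions: instantiations are tuples, bundles use "?" for hidden attributes, and the three bidders' valuations are hypothetical integers that already include the probabilities.

```python
HIDDEN = "?"  # assumed marker for a hidden attribute

def covers(bundle, instantiation):
    """True if the instantiation belongs to the natural bundle."""
    return all(b == HIDDEN or b == a for b, a in zip(bundle, instantiation))

def bundle_value(valuation, bundle):
    """Additive valuation v_i(b): sum of v_i(w) over instantiations w in b."""
    return sum(v for w, v in valuation.items() if covers(bundle, w))

def second_price(valuations, bundle):
    """Revenue 2(b): the second highest v_i(b) among all bidders."""
    values = sorted((bundle_value(v, bundle) for v in valuations), reverse=True)
    return values[1]

# Hypothetical toy data: two binary attributes, three bidders.
valuations = [
    {(0, 0): 5, (0, 1): 0, (1, 0): 0, (1, 1): 0},
    {(0, 0): 0, (0, 1): 4, (1, 0): 0, (1, 1): 0},
    {(0, 0): 1, (0, 1): 1, (1, 0): 3, (1, 1): 2},
]
print(second_price(valuations, (0, 0)))    # sold alone: 1
print(second_price(valuations, (0, "?")))  # second attribute hidden: 4
```

In this toy data, hiding the second attribute makes the first two bidders compete for the same bundle, so the bundle sells for 4 while its two instantiations sold separately would bring 1 + 1 = 2.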

Definition 2. An attribute hiding scheme is a way to cluster
the instantiations into natural bundles. An attribute hiding
scheme is characterized by a set of bundles $\{b_1, b_2, \ldots, b_t\}$,
satisfying

• All bundles are natural: $b_i \in B$ for $1 \le i \le t$.

• The bundles are disjoint:^3 for every pair of $b_i$ and $b_j$,
there exists an attribute, so that for this attribute, $b_i$ and
$b_j$ take different values (neither is ?).

Under the attribute hiding scheme $\{b_1, b_2, \ldots, b_t\}$, instantiations
covered by $b_i$ will have their attributes hidden to match
$b_i$. Essentially, instantiations in $b_i$ are sold in a bundle.
Instantiations not covered by any $b_i$ are sold without hiding
attributes (sold separately as natural bundles of size 1).
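The disjointness condition of Definition 2 and the rule for what bidders are told can be checked mechanically. The sketch below is illustrative only; the function names are assumptions, and bundles are again tuples with "?" for hidden attributes.

```python
HIDDEN = "?"  # assumed marker for a hidden attribute

def intersects(b1, b2):
    """Two natural bundles share an instantiation unless some attribute is
    revealed in both of them and takes different values."""
    return all(x == HIDDEN or y == HIDDEN or x == y for x, y in zip(b1, b2))

def is_valid_scheme(scheme):
    """A scheme is feasible when its bundles are pairwise disjoint."""
    return all(not intersects(a, b)
               for i, a in enumerate(scheme) for b in scheme[i + 1:])

def reveal(scheme, instantiation):
    """What the bidders are told when this instantiation is realized."""
    for b in scheme:
        if all(x == HIDDEN or x == a for x, a in zip(b, instantiation)):
            return b
    return instantiation  # not covered: sold as a bundle of size 1

scheme = [("?", 0, 0), (0, 1, "?")]
print(is_valid_scheme(scheme))      # True
print(reveal(scheme, (1, 0, 0)))    # ('?', 0, 0)
print(reveal(scheme, (1, 1, 1)))    # (1, 1, 1)
```

Because the bundles are disjoint, at most one bundle can cover a given instantiation, so `reveal` never has to choose between two candidates.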

Under attribute hiding scheme $\{b_1, b_2, \ldots, b_t\}$, the revenue
of the auctioneer equals

$$\sum_{1 \le i \le t} 2(b_i) + \sum_{\omega \in \Omega - \bigcup_{1 \le i \le t} b_i} 2(\omega)$$

We introduce another function r. For $b \in B$, r(b) represents
the extra revenue obtained by selling b as a bundle,
rather than selling instantiations in b separately. We have

$$r(b) = 2(b) - \sum_{\omega \in b} 2(\omega)$$

The revenue of the auctioneer can then be rewritten as

$$\sum_{1 \le i \le t} r(b_i) + \sum_{\omega \in \Omega} 2(\omega)$$

The second term of the above expression does not depend
on the attribute hiding scheme. Therefore, the problem of
designing an optimal attribute hiding scheme is equivalent to
the problem of searching for a set of disjoint natural bundles
$\{b_1, b_2, \ldots, b_t\}$, so that $\sum_{1 \le i \le t} r(b_i)$ is maximized.
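The rewriting of the revenue in terms of r can be verified numerically on a small example. The following Python sketch uses hypothetical valuations and assumed helper names; it computes the revenue of a scheme directly and compares it with the sum of extra revenues plus the scheme-independent baseline.

```python
from itertools import product

HIDDEN = "?"  # assumed marker for a hidden attribute

def covers(bundle, w):
    return all(b == HIDDEN or b == a for b, a in zip(bundle, w))

def second_price(valuations, bundle):
    """2(b): second highest additive bundle value among the bidders."""
    totals = sorted((sum(v for w, v in val.items() if covers(bundle, w))
                     for val in valuations), reverse=True)
    return totals[1]

def extra_revenue(valuations, bundle, omega):
    """r(b) = 2(b) minus the sum of 2(w) over instantiations w in b."""
    singles = sum(second_price(valuations, w) for w in omega if covers(bundle, w))
    return second_price(valuations, bundle) - singles

def scheme_revenue(valuations, scheme, omega):
    """Direct revenue: each bundle sold once, uncovered instantiations sold alone."""
    covered = {w for w in omega for b in scheme if covers(b, w)}
    return (sum(second_price(valuations, b) for b in scheme)
            + sum(second_price(valuations, w) for w in omega if w not in covered))

# Hypothetical toy data: two binary attributes, three bidders.
valuations = [
    {(0, 0): 5, (0, 1): 0, (1, 0): 0, (1, 1): 0},
    {(0, 0): 0, (0, 1): 4, (1, 0): 0, (1, 1): 0},
    {(0, 0): 1, (0, 1): 1, (1, 0): 3, (1, 1): 2},
]
omega = list(product(range(2), repeat=2))
scheme = [(0, "?")]
baseline = sum(second_price(valuations, w) for w in omega)
direct = scheme_revenue(valuations, scheme, omega)
rewritten = sum(extra_revenue(valuations, b, omega) for b in scheme) + baseline
print(direct, rewritten)  # both 4
```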

^2 Besides the full information setting, [Emek et al., 2012] also
discussed the more general Bayesian setting.

^3 If under an attribute hiding scheme, two different natural bundles
share one common instantiation, then for this instantiation, it is
not clear which attributes we should hide.



3 Hardness Result 

Previously, [Ghosh et al., 2007] showed that it is NP-hard
to solve for the optimal clustering scheme. The proof was 
by reduction from 3-partition: given 3z integers, determine 
whether it is possible to partition them into z groups with 
equal sums. In this section, we prove a similar result. We 
show that it is also NP-hard to solve for the optimal attribute 
hiding scheme. Our proof is by reduction from monotone 
one-in-three 3SAT [Schaefer, 1978]. Monotone one-in-three
3SAT is a variant of 3SAT. Monotone means that the liter- 
als are just variables, never negations. One-in-three means 
that the determination problem is to see whether there is an 
assignment so that for each clause, exactly one literal is true. 
We emphasize again that our result is not implied by the hardness
result from [Ghosh et al., 2007].

Theorem 1. It is NP-hard to solve for the optimal attribute 
hiding scheme. 

Proof. Let us consider the following monotone one-in-three 
3SAT instance with D clauses: 

$$(x_{f(1)} \lor x_{f(2)} \lor x_{f(3)}) \land (x_{f(4)} \lor x_{f(5)} \lor x_{f(6)}) \land \cdots \land (x_{f(3D-2)} \lor x_{f(3D-1)} \lor x_{f(3D)})$$

There are 3D literals, and they are from a list of E variables
($x_1$ to $x_E$; f's range is between 1 and E). According
to [Schaefer, 1978], it is NP-complete to determine whether
there exists an assignment of the $x_i$, so that the 3SAT instance
is true, and for each clause, there is exactly one true literal.

We will construct a probabilistic single-item auction sce- 
nario with m possible instantiations and n bidders. Both m 
and n are polynomial in E. We will show that for the con- 
structed scenario, if we are able to solve for the optimal at- 
tribute hiding scheme in polynomial time (in m), then we 
are able to determine the above 3SAT instance in polynomial 
time (in E). This implies that it is NP-hard to solve for the 
optimal attribute hiding scheme. 

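Our construction is as follows. Let the number of attributes
k be $\lceil \log_2 D \rceil + \lceil \log_2 E \rceil + 11$. All attributes are binary.
The total number of instantiations m is polynomial in E as
shown below.

$$m = 2^{\lceil \log_2 D \rceil + \lceil \log_2 E \rceil + 11} < 2^{11} \cdot 2D \cdot 2E = 8192DE < 8192E^4$$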

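Our proof relies on the following seven families of natural
bundles (Family 1 to 7):

$$
\begin{aligned}
&(\mathbf{e}, \mathbf{d}, 0, ?, ?, 0, 1, 0, 1, 0, 1, 0, 1) && (1) \\
&(\mathbf{e}, \mathbf{d}, ?, 0, ?, 0, 1, 0, 1, 0, 1, 0, 1) && (2) \\
&(\mathbf{e}, \mathbf{d}, ?, ?, 0, 0, 1, 0, 1, 0, 1, 0, 1) && (3) \\
&(\mathbf{e}, \mathbf{?}, 0, 0, 0, ?, ?, 0, 1, 0, 1, 0, 1) && (4) \\
&(\mathbf{?}, \mathbf{d}, 1, ?, ?, 0, 1, ?, ?, 0, 1, 0, 1) && (5) \\
&(\mathbf{?}, \mathbf{d}, ?, 1, ?, 0, 1, 0, 1, ?, ?, 0, 1) && (6) \\
&(\mathbf{?}, \mathbf{d}, ?, ?, 1, 0, 1, 0, 1, 0, 1, ?, ?) && (7)
\end{aligned}
$$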

In the above, $\mathbf{e}$ is the binary representation of integer e
($1 \le e \le E$). The representation's width is $\lceil \log_2 E \rceil$. Similarly,
$\mathbf{d}$ is the binary representation of integer d ($1 \le d \le D$).
The representation's width is $\lceil \log_2 D \rceil$. Finally, $\mathbf{?}$ is ? repeated
$\lceil \log_2 E \rceil$ times (Family 5, 6, and 7) or $\lceil \log_2 D \rceil$
times (Family 4).

We recall that the problem of designing an optimal attribute
hiding scheme is equivalent to the search of disjoint natural
bundles $\{b_1, b_2, \ldots, b_t\}$, so that $\sum_{1 \le i \le t} r(b_i)$ is maximized.
Given a natural bundle b, r(b) depends on the bidders' valuations.
We will construct a set of bidders, so that for any
natural bundle b, r(b) = 0 by default. The exceptions are:

• For i = 1, 2, 3, we use $b^i(e, d)$ to represent the natural
bundle characterized by e and d in Family i.
$r(b^i(e, d)) = 1$ if and only if, in the 3SAT instance, variable
e appears in the i-th position of clause d.

• We use $b^4(e)$ to represent the natural bundle characterized
by e in Family 4. Let #e be the number of times
variable e appears in the 3SAT instance. It is without
loss of generality to assume $\#e \le D$ (no literal appears
twice in a clause). Let $r(b^4(e)) = \#e(1 - \epsilon)$. Here, $\epsilon$ is
a constant that is less than $1/D$. The idea is to make sure
that $\#e(1 - \epsilon) > \#e - 1$.

• We use $b^5(d)$ to represent the natural bundle characterized
by d in Family 5. $r(b^5(d)) = 3$.

• We use $b^6(d)$ to represent the natural bundle characterized
by d in Family 6. $r(b^6(d)) = 3$.

• We use $b^7(d)$ to represent the natural bundle characterized
by d in Family 7. $r(b^7(d)) = 3$.

For now, we simply assume that it is possible to construct 
a polynomial number of bidders, so that the values of r{b) for 
different b are indeed as described above. We will provide the 
specific construction toward the end. 

Let O be an optimal attribute hiding scheme corresponding
to the above construction. If r(b) = 0, then it is without loss
of generality to assume $b \notin O$. Therefore, we can ignore
bundles not in the above seven families. Some bundles from
Family 1 to 3 can also be ignored for the same reason. For
presentation purposes, we call the remaining bundles helpful
bundles. A bundle b is helpful if and only if r(b) > 0.

Let us consider a fixed variable e ($1 \le e \le E$). e appears
#e times in the 3SAT instance, so there are exactly #e pairs
of d ($1 \le d \le D$) and i ($1 \le i \le 3$), so that $b^i(e, d)$ is helpful.
We use $b_{e,1}, b_{e,2}, \ldots, b_{e,\#e}$ to denote these helpful bundles.
They are the only helpful bundles that intersect $b^4(e)$.
If some of these bundles are not in O, then none of them is in
O. The reason is that $r(b^4(e)) = \#e(1 - \epsilon) > \#e - 1$, so it
is better to add $b^4(e)$ into O (and push out $b_{e,1}$ to $b_{e,\#e}$ if
they are in O). In summary, for e from 1 to E, we must have
one of the following two:

• $b_{e,1}, b_{e,2}, \ldots, b_{e,\#e}$ are all in O. $b^4(e)$ is not in O.

• None of $b_{e,1}, b_{e,2}, \ldots, b_{e,\#e}$ is in O. $b^4(e)$ is in O.

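Let T be the set of e values where $b_{e,1}, b_{e,2}, \ldots, b_{e,\#e}$
are all in O. Let F be the set of e values where none of
$b_{e,1}, b_{e,2}, \ldots, b_{e,\#e}$ is in O. We use $O_{1234}$ to denote the set
of helpful bundles in O that belong to Family 1 to 4. We have

$$
\begin{aligned}
\sum_{b \in O_{1234}} r(b) &= \sum_{e \in T} \#e + \sum_{e \in F} \#e(1 - \epsilon) \\
&= \epsilon \sum_{e \in T} \#e + \sum_{e \in T} \#e(1 - \epsilon) + \sum_{e \in F} \#e(1 - \epsilon) \\
&= \epsilon \sum_{e \in T} \#e + (1 - \epsilon)3D
\end{aligned}
$$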

Let us then consider a fixed clause d ($1 \le d \le D$). $b^5(d)$,
$b^6(d)$, and $b^7(d)$ pair-wise intersect. Therefore, in O, at most
one of them can appear. Actually, exactly one of them appears.
If none of them appears in O, then we can add $b^5(d)$ into O,
which results in higher revenue. Let $e_2$ and $e_3$ be the second
and third variables in clause d of the 3SAT instance. The
only helpful bundles $b^5(d)$ intersects with are $b^2(e_2, d)$ and
$b^3(e_3, d)$. By removing these two from O (if they are in O
to start with) and adding $b^5(d)$ into O, the revenue increases
(we gain 3 and lose at most 2).
Therefore, for any d from 1 to D, O contains exactly one
of $\{b^5(d), b^6(d), b^7(d)\}$. We use $O_{567}$ to denote the set of
helpful bundles in O that belong to Family 5 to 7. We have

$$\sum_{b \in O_{567}} r(b) = 3D$$

Hence,

$$\sum_{b \in O} r(b) = \sum_{b \in O_{1234}} r(b) + \sum_{b \in O_{567}} r(b) = \epsilon \sum_{e \in T} \#e + (2 - \epsilon)3D$$

Let d be a specific value between 1 and D. If $b^5(d)$ belongs
to O, then among helpful bundles characterized by d
from Family 1 to 3, the only helpful bundle that can coexist
with $b^5(d)$ is $b^1(e_1, d)$, where $e_1$ is the first variable in clause
d of the 3SAT instance. In general, no matter which among
$\{b^5(d), b^6(d), b^7(d)\}$ appears in O, among helpful bundles
characterized by d from Family 1 to 3, there is at most one
that can be in O. Therefore, the total number of helpful bundles
from Family 1 to 3 in O is at most D. We have

$$\sum_{e \in T} \#e \le D$$

$$\sum_{b \in O} r(b) = \epsilon \sum_{e \in T} \#e + (2 - \epsilon)3D \le \epsilon D + (2 - \epsilon)3D = 6D - 2D\epsilon$$

If we are able to solve for the optimal attribute hiding
scheme in polynomial time, then we are also able to determine
in polynomial time whether $\sum_{b \in O} r(b)$ is equal to the
upper bound $6D - 2D\epsilon$. If they are equal, then we have a
satisfying assignment of the 3SAT instance. For variable e,
$b_{e,1}$ to $b_{e,\#e}$ determine whether e is true or not. If they are all
in O, then e is set to be true. Otherwise (if none of them is
in O), e is set to be false. When the upper bound is reached,
$\sum_{e \in T} \#e = D$, which implies that under the above assignment,
there are exactly D true literals. Next, we show that two
true literals cannot appear in the same clause. That is, there
is exactly one true literal for each clause under the assignment,
and all clauses are satisfied (there are D true clauses).
Given d, let the variables in clause d be $e_1, e_2, e_3$. $b^1(e_1, d)$,
$b^2(e_2, d)$, and $b^3(e_3, d)$ are all helpful bundles. We proved
that among helpful bundles characterized by d from Family 1
to 3, there is at most one that can be in O. Therefore, only
one of $b^1(e_1, d)$, $b^2(e_2, d)$, $b^3(e_3, d)$ can be in O. That is, only
one of $e_1, e_2, e_3$ is set to be true.



The other direction can be shown similarly. If there is a satisfying
assignment of the 3SAT instance, then $\sum_{b \in O} r(b)$
should match the upper bound $6D - 2D\epsilon$.

In conclusion, for the constructed auction setting, it is NP-hard
to determine whether the optimal revenue $\sum_{b \in O} r(b) + \sum_{\omega \in \Omega} 2(\omega)$
reaches $6D - 2D\epsilon + \sum_{\omega \in \Omega} 2(\omega)$.

Finally, we still need to show that it is possible to construct 
a polynomial number of bidders, so that the values of r{b) 
are exactly as described above. Due to space constraint, we 
present the construction and omit the proof. 

• We construct two bidders who both value every instantiation
equally, and the valuation for every instantiation is
L (L > D).

• For every helpful bundle b, we construct two new bidders.
By default, both bidders value all instantiations in
b at L and value all instantiations outside of b at 0. The
exceptions are that one bidder values instantiation $b|_?^0$ at
r(b) + L and the other bidder values instantiation $b|_?^1$ at
r(b) + L. Here, $b|_?^y$ is the instantiation resulting from
replacing all ? in b by y.

□ 

4 Tree-Structured Attribute Hiding Schemes 

In this section, we study a special family of attribute hiding 
schemes, which we call the tree-structured schemes. 

Let b be a non-unit natural bundle (bundle of size greater
than 1). For b, at least one attribute is hidden. Let x be one of
the hidden attributes of b. We can split b into $C_x$ disjoint natural
bundles by revealing attribute x. The resulting bundles
are $b|_x^0, b|_x^1, \ldots, b|_x^{C_x - 1}$, where $b|_x^i$ represents the natural
bundle obtained by replacing the x-th attribute of b by i. If b
belongs to an attribute hiding scheme O, then after splitting b,
the new scheme becomes

$$(O - \{b\}) \cup \{b|_x^0, b|_x^1, \ldots, b|_x^{C_x - 1}\}$$

It is easy to see that the new scheme is still feasible (the bundles
remain disjoint).

Tree-structured attribute hiding schemes are results 
of recursive splitting (revealing attribute) starting from 
{(?,?,...,?)}. At every step, we either terminate and keep 
the current scheme, or pick a non-unit bundle from the current 
scheme, and split (reveal) one of its attributes. 

Definition 3. An attribute hiding scheme O is tree-structured 
if and only if it satisfies one of the following: 

• O = {(?, ?,..., ?)}: the scheme is simply hiding all 
attributes and selling all instantiations in a single bundle. 

• There exists a tree-structured attribute hiding scheme O'.
There exists a bundle $b \in O'$ whose x-th attribute is
hidden. After splitting b by revealing attribute x, the
resulting scheme is equivalent to O.^4

Let us consider an example with three binary attributes. 
{(?, ?, ?)} is, by definition, a tree-structured attribute hiding 
scheme. Starting from {(?, ?, ?)}, if we pick (?, ?, ?) and re- 
veal its second attribute, then we get 



        (?,?,?)
        /     \
  (?,0,?)     (?,1,?)

The leaves {(?, 0, ?),(?, 1, ?)} characterize a new tree- 
structured attribute hiding scheme. If we further split the first 
bundle (?, 0, ?) based on its third attribute, then we get



        (?,0,?)
        /     \
  (?,0,0)     (?,0,1)

Again, the leaves {(?, 0, 0), (?, 0, 1), (?, 1, ?)} characterize a 
new tree-structured attribute hiding scheme. 

Proposition 1. If there are at most two attributes, then all 
attribute hiding schemes are tree-structured.^5
Proposition 2. If there are at least three attributes, then there 
exist attribute hiding schemes that are not tree-structured. 

Proof. We construct the following natural bundles. For i
from 1 to k, let $b_i$'s i-th attribute be hidden, let $b_i$'s ((i mod
k) + 1)-th attribute be 1, and let all other attributes of $b_i$ be 0.

$b_1 = (?, 1, 0, 0, \ldots, 0, 0)$

$b_2 = (0, ?, 1, 0, \ldots, 0, 0)$

$b_3 = (0, 0, ?, 1, \ldots, 0, 0)$

...

$b_{k-1} = (0, 0, 0, 0, \ldots, ?, 1)$

$b_k = (1, 0, 0, 0, \ldots, 0, ?)$

The $b_i$ are disjoint. $\{b_1, b_2, \ldots, b_k\}$ is not tree-structured because
starting from (?, ?, ..., ?), if we ever reveal an attribute
(e.g., attribute x), then $b_x$ cannot be in the final scheme. □
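The construction in this proof can be checked by brute force for small k. The Python sketch below builds the k cyclic bundles, confirms that they are pairwise disjoint, and tests tree-structuredness with a recursive check that is an assumption of this sketch (it follows Definition 3 by trying every attribute that could be revealed first). All attributes are taken to be binary, and the function names are illustrative.

```python
HIDDEN = "?"  # assumed marker for a hidden attribute

def cyclic_bundles(k):
    """b_i hides attribute i, sets the next attribute (cyclically) to 1, rest 0."""
    bundles = []
    for i in range(k):            # 0-based i stands for the paper's attribute i + 1
        b = [0] * k
        b[i] = HIDDEN
        b[(i + 1) % k] = 1
        bundles.append(tuple(b))
    return bundles

def intersects(b1, b2):
    return all(x == HIDDEN or y == HIDDEN or x == y for x, y in zip(b1, b2))

def is_tree_structured(nonunit, parent, sizes):
    """Can the non-unit bundles inside `parent` be produced by recursive splitting?"""
    if not nonunit or nonunit == [parent]:
        return True
    for x, value in enumerate(parent):
        # Attribute x can be revealed first only if no target bundle hides it.
        if value != HIDDEN or any(b[x] == HIDDEN for b in nonunit):
            continue
        if all(is_tree_structured([b for b in nonunit if b[x] == j],
                                  parent[:x] + (j,) + parent[x + 1:], sizes)
               for j in range(sizes[x])):
            return True
    return False

for k in (3, 4, 5):
    bs = cyclic_bundles(k)
    disjoint = all(not intersects(a, b) for i, a in enumerate(bs) for b in bs[i + 1:])
    print(k, disjoint, is_tree_structured(bs, (HIDDEN,) * k, [2] * k))
```

For comparison, the example scheme {(?, 0, 0), (?, 0, 1), (?, 1, ?)} from earlier in this section passes the same check, since it is obtained by revealing the second attribute and then the third.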

As we mentioned earlier, tree-structured attribute hiding 
schemes are results of recursive splitting starting from the 
bundle of all instantiations. At every step, we either termi- 
nate or split a non-unit bundle in some way. For every natural 
bundle b, let t(b) be the optimal revenue for selling instantiations
in b, as a result of making optimal recursive splitting
decisions on b. t((?, ?, ..., ?)) is then the optimal revenue of
tree-structured attribute hiding schemes. Given a bundle, we
either sell it as a whole, or split it in some way as a first step.
Let h(b) be the set of hidden attributes of b. We have

$$t(b) = \max\Big\{ 2(b),\ \max_{x \in h(b)} \sum_{0 \le j \le C_x - 1} t(b|_x^j) \Big\}$$

If b has size 1, then $h(b) = \emptyset$. That is, for unit bundles,
t(b) = 2(b). Given the values of t(b) for all b with |h(b)| = y,
we can then easily compute the values of t(b) for all b with
|h(b)| = y + 1. The total number of natural bundles |B|
is polynomial in m. For every b, t(b) is the maximum of
at most k + 1 values, and k + 1 is at most $\log_2 m + 1$. Therefore,
the optimal revenue t((?, ?, ..., ?)) can be computed in
polynomial time. The corresponding optimal scheme can be
obtained along the way.
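The recursion for t(b) translates directly into a memoized function. The sketch below is one possible rendering rather than the authors' implementation: it evaluates the recursion top-down with caching instead of level by level, and the valuations are hypothetical toy data with two binary attributes.

```python
from functools import lru_cache

HIDDEN = "?"  # assumed marker for a hidden attribute

def make_tree_solver(valuations, sizes):
    """Return a function t(bundle) giving the best tree-structured revenue."""
    def covers(bundle, w):
        return all(b == HIDDEN or b == a for b, a in zip(bundle, w))

    def second_price(bundle):
        totals = sorted((sum(v for w, v in val.items() if covers(bundle, w))
                         for val in valuations), reverse=True)
        return totals[1]

    @lru_cache(maxsize=None)
    def t(bundle):
        best = second_price(bundle)               # option 1: sell b as a whole
        for x, value in enumerate(bundle):
            if value == HIDDEN:                   # option 2: reveal attribute x
                split = sum(t(bundle[:x] + (j,) + bundle[x + 1:])
                            for j in range(sizes[x]))
                best = max(best, split)
        return best

    return t

# Hypothetical toy data: two binary attributes, three bidders.
valuations = [
    {(0, 0): 5, (0, 1): 0, (1, 0): 0, (1, 1): 0},
    {(0, 0): 0, (0, 1): 4, (1, 0): 0, (1, 1): 0},
    {(0, 0): 1, (0, 1): 1, (1, 0): 3, (1, 1): 2},
]
t = make_tree_solver(valuations, sizes=[2, 2])
print(t(("?", "?")))  # 7: reveal the second attribute, keep the first hidden
```

In this example selling everything as one bundle yields 5 and revealing the first attribute yields 4, while revealing only the second attribute yields 4 + 3 = 7.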



^4 Two schemes are equivalent if they share the same set of non-unit
bundles.

^5 This proposition implies that if there are at most two attributes
(m can still be large), then we can solve for the optimal attribute hiding
scheme in polynomial time, because it must be tree-structured.



5 Upper Bound and Weighted Matching 

Our objective is to find a set of disjoint natural bundles, denoted
by O, which maximizes $\sum_{b \in O} r(b)$. We model
it as an integer program. We introduce |B| binary variables.
For $b \in B$, let $z_b$ be a binary variable. If $z_b = 1$, then it
means $b \in O$. The number of binary variables |B| is polynomial
in m. The objective is to maximize $\sum_{b \in B} z_b r(b)$. The
constraints are that bundles in O are disjoint. That is, for
$b_1, b_2 \in B$, if $b_1$ and $b_2$ intersect, $z_{b_1} + z_{b_2} \le 1$. The number
of constraints is at most $|B|^2$, which is polynomial in m. In
summary, the optimal revenue can be solved for based on an
integer program with polynomial numbers of variables and
constraints. One upper bound can then be solved for in polynomial
time if we consider the linear relaxation (replacing the
binary variables by continuous variables in [0, 1]).
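Written out, the integer program described above takes the following form. This is a restatement of the text in standard notation, with the constraint set indexed over pairs of distinct intersecting natural bundles:

```latex
\begin{align*}
\max_{z} \quad & \sum_{b \in B} z_b \, r(b) \\
\text{s.t.} \quad & z_{b_1} + z_{b_2} \le 1
  && \forall\, b_1, b_2 \in B,\ b_1 \ne b_2,\ b_1 \cap b_2 \ne \emptyset \\
& z_b \in \{0, 1\} && \forall\, b \in B
\end{align*}
```

The linear relaxation keeps the same objective and constraints but replaces the last line with $0 \le z_b \le 1$. Every feasible integer solution remains feasible for the relaxation, which is why its optimal value is an upper bound on $\sum_{b \in O} r(b)$.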

Some preprocessing can vastly reduce the number of variables
in the above program. We first observe that, by definition,
r(b) = 0 for all b with size 1. That is, we can safely
set $z_b = 0$ for all b with size 1. We then observe that, for
any natural bundle b with size greater than 1, if the following
expression is true, then it means that instead of selling b as a
single bundle, we can achieve higher revenue by recursively
splitting it, in which case we can safely set $z_b = 0$.

$$2(b) < \max_{x \in h(b)} \sum_{0 \le i \le C_x - 1} t(b|_x^i)$$

In Section 6, our simulation shows that when computing
the upper bound, the above observations indeed vastly reduce
the number of variables in the linear program. For example,
for settings with 10 binary attributes and 10 bidders, originally,
there are as many as $(2 + 1)^{10} = 59049$ variables. After
preprocessing, there are only 220.28 variables on average
over repeated simulations. In these simulations, for each
instantiation, bidders' valuations are drawn independently
from U(0, 1). For every setup, we repeat 100 times and
report the averages.

We then discuss another heuristic for generating attribute 
hiding schemes with high revenue. This heuristic only applies 
to settings where all attributes are binary. If all attributes are 
binary, then a natural bundle with only one attribute hidden 
contains exactly two instantiations. The heuristic is based on 
maximum weighted matching. We view all instantiations as 
vertices. If two instantiations can be merged into a natural 
bundle b, and r{b) > 0, then we create an edge with weight 
r{b) between them. Maximum weighted matching can be 
solved in polynomial time. The matching result characterizes
the optimal attribute hiding scheme under the additional
constraint that at most one attribute is hidden.^6
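The matching heuristic can be illustrated on a tiny instance. In the sketch below, the graph is built as described (one vertex per instantiation, one edge per single-hidden-attribute bundle with positive r), but the matching itself is found by exhaustive search, which is an assumption made to keep the example dependency-free and is only practical for very small inputs; a polynomial-time maximum weighted matching algorithm would be substituted in practice. The valuations and function names are hypothetical.

```python
from itertools import product

HIDDEN = "?"  # assumed marker for a hidden attribute

def covers(bundle, w):
    return all(b == HIDDEN or b == a for b, a in zip(bundle, w))

def second_price(valuations, bundle):
    totals = sorted((sum(v for w, v in val.items() if covers(bundle, w))
                     for val in valuations), reverse=True)
    return totals[1]

def matching_scheme(valuations, k):
    """Best scheme with at most one hidden attribute per bundle (binary attributes)."""
    omega = list(product(range(2), repeat=k))      # lexicographically sorted
    single = {w: second_price(valuations, w) for w in omega}
    edges = {}
    for w in omega:
        for x in range(k):
            if w[x] == 0:
                partner = w[:x] + (1,) + w[x + 1:]
                bundle = w[:x] + (HIDDEN,) + w[x + 1:]
                gain = second_price(valuations, bundle) - single[w] - single[partner]
                if gain > 0:
                    edges[(w, partner)] = (gain, bundle)   # key has smaller vertex first

    def best(remaining):
        if not remaining:
            return 0, []
        w, rest = remaining[0], remaining[1:]
        top = best(rest)                                   # leave w unmatched
        for u in rest:
            if (w, u) in edges:                            # w precedes u in sorted order
                gain, bundle = edges[(w, u)]
                sub_gain, sub_bundles = best(tuple(v for v in rest if v != u))
                if gain + sub_gain > top[0]:
                    top = (gain + sub_gain, [bundle] + sub_bundles)
        return top

    return best(tuple(omega))

valuations = [
    {(0, 0): 5, (0, 1): 0, (1, 0): 0, (1, 1): 0},
    {(0, 0): 0, (0, 1): 4, (1, 0): 0, (1, 1): 0},
    {(0, 0): 1, (0, 1): 1, (1, 0): 3, (1, 1): 2},
]
gain, scheme = matching_scheme(valuations, 2)
print(gain, scheme)  # 5 [('?', 0), ('?', 1)]
```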

6 Experiments 

In this section, we evaluate the performance of the proposed
heuristic-based attribute hiding schemes. For different values
of k, C, and n, we construct problem instances with k
attributes, each attribute taking C possible values, and n bidders.
The total number of possible instantiations is then $C^k$.



^6 In Section 6, our simulation shows that there are generally very
few natural bundles with at least two hidden attributes that cannot be
recursively split to achieve higher revenue. This somewhat justifies
the heuristic requirement that at most one attribute is hidden.



Setup               Tree    Match   UB      #Opt   #Var     HM
k = n = 3,  C = 2   13.33   11.58   15.42   47     5.82     1.08
k = n = 5,  C = 2   3.953   3.810   4.354   35     15.8     1.54
k = n = 10, C = 2   0.836   0.927   0.950          220.28   4.76
k = n = 3,  C = 3   9.251   NA      10.58   25     13.28    0.96
k = n = 5,  C = 3   1.767   NA      1.976          45.39    0.3
k = n = 8,  C = 3   0.296   NA      0.361          326.18   0.01


The table fields are described below: 

• Tree, Match, UB: Compared to selling all instantiations
separately, the extra revenue in terms of percentage.
Tree is short for optimal tree-structured scheme.
Match is short for optimal scheme based on maximum
weighted matching (only applies to C = 2). UB is short
for upper bound on the optimal revenue.

• #Opt: Among 100 repeated simulations, how many
times one of the heuristic-based schemes reaches the upper
bound (therefore guarantees optimality^7).

• #Var: How many variables are in the linear program for
computing the upper bound.

• HM: How many natural bundles with at least two hidden 
attributes cannot be recursively split to achieve higher 
revenue. 



7 Future Research 

Given the fact that it is NP-hard to solve for the optimal 
attribute hiding scheme, one direction of future research is 
to study whether there are heuristic-based attribute hiding
schemes that guarantee a constant fraction of the optimal rev- 
enue. A similar direction is to see how much revenue we 
lose by not allowing unnatural bundles. A preliminary result 
shows that the optimal revenue by clustering (allowing un- 
natural bundles) can be as high as twice the optimal revenue 
by hiding attributes. The construction is as follows. There 
are m instantiations and m bidders. Bidder i only values instantiation
i positively. Let instantiation 1 be (0, 0, ..., 0)
and bidder 1's valuation for it be $m/2$. Let instantiation m
be (1, 1, ..., 1) and bidder m's valuation for it be $m/2$. For
1 < i < m, let bidder i's valuation for instantiation i be 1.
With this setup, the optimal revenue by clustering is $m - 1$.
The optimal revenue by hiding attributes is $m/2$. The ratio
approaches 2 for large m.



^7 Even if the heuristic-based schemes do not reach the upper
bound, they may still possibly be optimal.